The main function of the package that performs a gene set analysis for a list of gene sets.
geneSetAnalysis(
    ...,
    dat,
    geneSets,
    analysis,
    signLevel = 0.05,
    preprocessGeneSets = FALSE,
    adjustmentMethod = p.adjust.methods,
    cluster = NULL)
An object of the type gsaResult with the following elements:
A vector of p-values, one for each gene set. These values are already adjusted for multiple testing according to the adjustmentMethod parameter.
The raw unadjusted p-values, one for each gene set.
A list comprising the detailed results for each gene set. Each element of this list is another list with the following components:
pValue
: The raw (unadjusted) p-value for the gene set.
geneSetValues
: If analysis is a global analysis, this is the object returned by the method for the corresponding gene set. For an analysis pipeline, this holds the values of the gene-level statistic, the transformed values, and the values of the gene set statistic (see also gsAnalysis).
significanceValues
: Gene set statistics for each randomly drawn gene set used for significance assessment, together with a list of these gene sets. Only set for analyses of type 'geneSetAnalysis'; NULL for 'global' analyses.
geneSet
: The supplied gene set.
The significance level used for this analysis.
The performed analysis (of type gsAnalysis).
A character string identifying the analysis as an enrichment analysis pipeline ("geneSetAnalysis") or as a global analysis ("global").
The method used to adjust the p-values in adjustedPValues.
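To illustrate how raw and adjusted p-values relate, the following sketch applies base R's p.adjust to a made-up vector of raw p-values. The gene set names and values are invented for illustration, and it is only an assumption that the package carries out the adjustment in an equivalent way.

```r
# Hypothetical raw p-values for four gene sets (values invented for illustration)
rawP <- c(setA = 0.001, setB = 0.02, setC = 0.04, setD = 0.30)

# Adjust for multiple testing; in p.adjust, "fdr" is an alias for "BH"
adjP <- p.adjust(rawP, method = "fdr")

# Gene sets that remain significant at the chosen level after adjustment
signLevel <- 0.05
significant <- names(adjP)[adjP < signLevel]

print(adjP)
print(significant)
```

Adjusted values are never smaller than the raw ones, so a gene set can lose significance after adjustment but cannot gain it.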
...
: Additional parameters for the different steps of the analysis pipeline, depending on the concrete configuration supplied in analysis.
dat
: A numeric matrix of gene expression values for all analyzed genes. Each row corresponds to one gene, and each column corresponds to one sample. The rows must be named with the gene names used in the gene sets.
geneSets
: A list of gene sets, where each gene set is a vector of gene names corresponding to the row names of dat.
analysis
: An object of type gsAnalysis as returned by gsAnalysis or by the predefined configurations (see predefinedAnalyses).
signLevel
: The significance level for the significance assessment of the gene sets (defaults to 0.05).
preprocessGeneSets
: Specifies whether the gene sets in geneSets should be preprocessed. If set to TRUE, all genes that are not part of the data set (i.e. not in rownames(dat)) are removed from the gene sets.
adjustmentMethod
: The method to use for the adjustment for multiple testing (see the method parameter of p.adjust for possible values).
cluster
: If the analyses should be applied in parallel for the gene sets, this parameter must hold an initialized cluster as returned by makeCluster. If this parameter is NULL, the analyses are performed sequentially.
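The expected shape of dat and geneSets, and the effect described for preprocessGeneSets = TRUE, can be sketched in base R as follows. This is a minimal illustration with invented gene and sample names, not the package's own preprocessing code.

```r
# Toy expression matrix: 4 genes (rows) x 3 samples (columns), random values
set.seed(1)
dat <- matrix(rnorm(12), nrow = 4,
              dimnames = list(c("g1", "g2", "g3", "g4"), c("s1", "s2", "s3")))

# Gene sets as a named list of character vectors;
# "gX" and "gY" are hypothetical genes that do not occur in dat
geneSets <- list(setA = c("g1", "g2", "gX"),
                 setB = c("g3", "gY"))

# Keep only those genes that are present in rownames(dat)
filtered <- lapply(geneSets, function(gs) gs[gs %in% rownames(dat)])

print(filtered)
```

Filtering up front keeps the gene set sizes consistent with the genes that can actually contribute to a gene set statistic.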
This is the main interface function of the package for gene set enrichment analyses. Analyses usually consist of a pipeline of steps. Often, the first step is the calculation of a summary statistic for the relation of each gene to the class labels. These values or transformations thereof are employed to calculate a gene set statistic for each of the supplied gene sets. The significance of gene set enrichments can be determined according to different methods, and the robustness of gene sets can be evaluated by slightly modifying the gene sets. To provide a flexible mechanism for the plethora of different approaches arising from the different choices, basic pipeline configurations are encapsulated in gsAnalysis objects, which can be created using the gsAnalysis function. Ready-to-use configuration objects for certain well-known methods are included in the package (see predefinedAnalyses). Parameters of the chosen analysis pipeline can be set in the ... parameter.
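The pipeline idea described above can be sketched in base R. The example below is a simplified illustration on simulated data (gene-level correlation, absolute-value transformation, mean as gene set statistic, and comparison against randomly drawn gene sets); all variable names are hypothetical, and it is not the package's implementation of any of these steps.

```r
set.seed(42)

# Toy data: 50 genes x 20 samples, two classes of 10 samples each
nGenes <- 50
labs <- rep(c(0, 1), each = 10)
dat <- matrix(rnorm(nGenes * 20), nrow = nGenes,
              dimnames = list(paste0("g", seq_len(nGenes)), NULL))

# Shift the first five genes in class 1 so that they correlate with the labels
dat[1:5, labs == 1] <- dat[1:5, labs == 1] + 2
geneSet <- paste0("g", 1:5)

# Step 1: gene-level statistic (Pearson correlation of each gene with the labels)
geneStats <- apply(dat, 1, function(x) cor(x, labs))

# Step 2: transformation of the gene-level statistic (absolute value)
transformed <- abs(geneStats)

# Step 3: gene set statistic (mean of the transformed values in the set)
setStatistic <- function(genes) mean(transformed[genes])
observed <- setStatistic(geneSet)

# Step 4: significance by comparison with random gene sets of the same size
numSamples <- 1000
randomStats <- replicate(
  numSamples,
  setStatistic(sample(rownames(dat), length(geneSet))))
pValue <- mean(randomStats >= observed)

print(c(observed = observed, pValue = pValue))
```

Each step is an independent function of the previous step's output, which is why different choices for the individual steps can be combined freely in a configuration object.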
Ackermann, M., Strimmer, K. (2009) A general modular framework for gene set enrichment analysis. BMC Bioinformatics, 10(1), 47.
gsAnalysis, gls, transformation, gss, global, significance, evaluateGeneSetUncertainty, hist.gsaResult, preprocessGs
# load data
data(exampleData)

# apply predefined analysis for gene set enrichment analysis
res <- geneSetAnalysis(
    # parameters for geneSetAnalysis
    dat = countdata,
    geneSets = pathways[1],
    analysis = analysis.averageCorrelation(),
    adjustmentMethod = "fdr",
    # additional parameters for analysis.averageCorrelation
    labs = labels,
    method = "pearson",
    numSamples = 10)